Search Results

Documents authored by Zheng, Yu


Found 4 Possible Name Variants:

Zheng, Yu

Document
On Relaxed Locally Decodable Codes for Hamming and Insertion-Deletion Errors

Authors: Alexander R. Block, Jeremiah Blocki, Kuan Cheng, Elena Grigorescu, Xin Li, Yu Zheng, and Minshen Zhu

Published in: LIPIcs, Volume 264, 38th Computational Complexity Conference (CCC 2023)


Abstract
Locally Decodable Codes (LDCs) are error-correcting codes C:Σⁿ → Σ^m, encoding messages in Σⁿ to codewords in Σ^m, with super-fast decoding algorithms. They are important mathematical objects in many areas of theoretical computer science, yet the best constructions so far have codeword length m that is super-polynomial in n, for codes with constant query complexity and constant alphabet size. In a very surprising result, Ben-Sasson, Goldreich, Harsha, Sudan, and Vadhan (SICOMP 2006) showed how to construct a relaxed version of LDCs (RLDCs) with constant query complexity and almost linear codeword length over the binary alphabet, and used them to obtain significantly improved constructions of Probabilistically Checkable Proofs. In this work, we study RLDCs in the standard Hamming-error setting, and introduce their variants in the insertion and deletion (Insdel) error setting. Standard LDCs for Insdel errors were first studied by Ostrovsky and Paskin-Cherniavsky (Information Theoretic Security, 2015), and are further motivated by recent advances in DNA random access bio-technologies. Our first result is an exponential lower bound on the length of Hamming RLDCs making 2 queries (even adaptively), over the binary alphabet. This answers a question explicitly raised by Gur and Lachish (SICOMP 2021) and is the first exponential lower bound for RLDCs. Combined with the results of Ben-Sasson et al., our result exhibits a "phase-transition"-type behavior on the codeword length for some constant-query complexity. We achieve these lower bounds via a transformation of RLDCs to standard Hamming LDCs, using a careful analysis of restrictions of message bits that fix codeword bits. We further define two variants of RLDCs in the Insdel-error setting, a weak and a strong version. On the one hand, we construct weak Insdel RLDCs with almost linear codeword length and constant query complexity, matching the parameters of the Hamming variants. On the other hand, we prove exponential lower bounds for strong Insdel RLDCs. These results demonstrate that, while these variants are equivalent in the Hamming setting, they are significantly different in the Insdel setting. Our results also prove a strict separation between Hamming RLDCs and Insdel RLDCs.

Cite as

Alexander R. Block, Jeremiah Blocki, Kuan Cheng, Elena Grigorescu, Xin Li, Yu Zheng, and Minshen Zhu. On Relaxed Locally Decodable Codes for Hamming and Insertion-Deletion Errors. In 38th Computational Complexity Conference (CCC 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 264, pp. 14:1-14:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


@InProceedings{block_et_al:LIPIcs.CCC.2023.14,
  author =	{Block, Alexander R. and Blocki, Jeremiah and Cheng, Kuan and Grigorescu, Elena and Li, Xin and Zheng, Yu and Zhu, Minshen},
  title =	{{On Relaxed Locally Decodable Codes for Hamming and Insertion-Deletion Errors}},
  booktitle =	{38th Computational Complexity Conference (CCC 2023)},
  pages =	{14:1--14:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-282-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{264},
  editor =	{Ta-Shma, Amnon},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2023.14},
  URN =		{urn:nbn:de:0030-drops-182847},
  doi =		{10.4230/LIPIcs.CCC.2023.14},
  annote =	{Keywords: Relaxed Locally Decodable Codes, Hamming Errors, Insdel Errors, Lower Bounds}
}
Document
Track A: Algorithms, Complexity and Games
Linear Insertion Deletion Codes in the High-Noise and High-Rate Regimes

Authors: Kuan Cheng, Zhengzhong Jin, Xin Li, Zhide Wei, and Yu Zheng

Published in: LIPIcs, Volume 261, 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)


Abstract
This work continues the study of linear error correcting codes against adversarial insertion deletion errors (insdel errors). Previously, the work of Cheng, Guruswami, Haeupler, and Li [Kuan Cheng et al., 2021] showed the existence of asymptotically good linear insdel codes that can correct arbitrarily close to 1 fraction of errors over some constant size alphabet, or achieve rate arbitrarily close to 1/2 even over the binary alphabet. As shown in [Kuan Cheng et al., 2021], these bounds are also the best possible. However, known explicit constructions in [Kuan Cheng et al., 2021], and subsequent improved constructions by Con, Shpilka, and Tamo [Con et al., 2022] all fall short of meeting these bounds. Over any constant size alphabet, they can only achieve rate < 1/8 or correct < 1/4 fraction of errors; over the binary alphabet, they can only achieve rate < 1/1216 or correct < 1/54 fraction of errors. Apparently, previous techniques face inherent barriers to achieve rate better than 1/4 or correct more than 1/2 fraction of errors. In this work we give new constructions of such codes that meet these bounds, namely, asymptotically good linear insdel codes that can correct arbitrarily close to 1 fraction of errors over some constant size alphabet, and binary asymptotically good linear insdel codes that can achieve rate arbitrarily close to 1/2. All our constructions are efficiently encodable and decodable. Our constructions are based on a novel approach of code concatenation, which embeds the index information implicitly into codewords. This significantly differs from previous techniques and may be of independent interest. Finally, we also prove the existence of linear concatenated insdel codes with parameters that match random linear codes, and propose a conjecture about linear insdel codes.

Cite as

Kuan Cheng, Zhengzhong Jin, Xin Li, Zhide Wei, and Yu Zheng. Linear Insertion Deletion Codes in the High-Noise and High-Rate Regimes. In 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 261, pp. 41:1-41:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


@InProceedings{cheng_et_al:LIPIcs.ICALP.2023.41,
  author =	{Cheng, Kuan and Jin, Zhengzhong and Li, Xin and Wei, Zhide and Zheng, Yu},
  title =	{{Linear Insertion Deletion Codes in the High-Noise and High-Rate Regimes}},
  booktitle =	{50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)},
  pages =	{41:1--41:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-278-5},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{261},
  editor =	{Etessami, Kousha and Feige, Uriel and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2023.41},
  URN =		{urn:nbn:de:0030-drops-180931},
  doi =		{10.4230/LIPIcs.ICALP.2023.41},
  annote =	{Keywords: Error correcting code, Edit distance, Pseudorandomness, Derandomization}
}
Document
Lower Bounds and Improved Algorithms for Asymmetric Streaming Edit Distance and Longest Common Subsequence

Authors: Xin Li and Yu Zheng

Published in: LIPIcs, Volume 213, 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021)


Abstract
In this paper, we study edit distance (ED) and longest common subsequence (LCS) in the asymmetric streaming model, introduced by Saks and Seshadhri [Saks and Seshadhri, 2013]. As an intermediate model between the random access model and the streaming model, this model allows one to have streaming access to one string and random access to the other string. Meanwhile, ED and LCS are both fundamental problems that are often studied on large strings, thus the (asymmetric) streaming model is ideal for studying these problems. Our first main contribution is a systematic study of space lower bounds for ED and LCS in the asymmetric streaming model. Previously, there were no explicitly stated results in this context, although some lower bounds about LCS can be inferred from the lower bounds for longest increasing subsequence (LIS) in [Sun and Woodruff, 2007; Gál and Gopalan, 2010; Ergun and Jowhari, 2008]. Yet these bounds only work for large alphabet size. In this paper, we develop several new techniques to handle ED in general and LCS for small alphabet size, thus establishing strong lower bounds for both problems. In particular, our lower bound for ED provides an exponential separation between edit distance and Hamming distance in the asymmetric streaming model. Our lower bounds also extend to LIS and longest non-decreasing subsequence (LNS) in the standard streaming model. Together with previous results, our bounds provide an almost complete picture for these two problems. As our second main contribution, we give improved algorithms for ED and LCS in the asymmetric streaming model. For ED, we improve the space complexity of the constant factor approximation algorithms in [Farhadi et al., 2020; Cheng et al., 2020] from Õ(n^δ/δ) to O(d^δ/δ · polylog(n)), where n is the length of each string and d is the edit distance between the two strings. For LCS, we give the first 1/2+ε approximation algorithm with space n^δ for any constant δ > 0, over a binary alphabet. 
Our work leaves a plethora of intriguing open questions, including establishing lower bounds and designing algorithms for a natural generalization of LIS and LNS, which we call longest non-decreasing subsequence with threshold (LNST).

Cite as

Xin Li and Yu Zheng. Lower Bounds and Improved Algorithms for Asymmetric Streaming Edit Distance and Longest Common Subsequence. In 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 213, pp. 27:1-27:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{li_et_al:LIPIcs.FSTTCS.2021.27,
  author =	{Li, Xin and Zheng, Yu},
  title =	{{Lower Bounds and Improved Algorithms for Asymmetric Streaming Edit Distance and Longest Common Subsequence}},
  booktitle =	{41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021)},
  pages =	{27:1--27:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-215-0},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{213},
  editor =	{Boja\'{n}czyk, Miko{\l}aj and Chekuri, Chandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2021.27},
  URN =		{urn:nbn:de:0030-drops-155381},
  doi =		{10.4230/LIPIcs.FSTTCS.2021.27},
  annote =	{Keywords: Asymmetric Streaming Model, Edit Distance, Longest Common Subsequence, Space Lower Bound}
}
Document
Track A: Algorithms, Complexity and Games
Streaming and Small Space Approximation Algorithms for Edit Distance and Longest Common Subsequence

Authors: Kuan Cheng, Alireza Farhadi, MohammadTaghi Hajiaghayi, Zhengzhong Jin, Xin Li, Aviad Rubinstein, Saeed Seddighin, and Yu Zheng

Published in: LIPIcs, Volume 198, 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)


Abstract
The edit distance (ED) and longest common subsequence (LCS) are two fundamental problems which quantify how similar two strings are to one another. In this paper, we first consider these problems in the asymmetric streaming model introduced by Andoni, Krauthgamer and Onak [Andoni et al., 2010] (FOCS'10) and Saks and Seshadhri [Saks and Seshadhri, 2013] (SODA'13). In this model we have random access to one string and streaming access to the other one. Our main contribution is a constant factor approximation algorithm for ED with memory Õ(n^δ) for any constant δ > 0. In addition to this, we present an upper bound of Õ_ε(√n) on the memory needed to approximate ED or LCS within a factor 1±ε. All our algorithms are deterministic and run in polynomial time in a single pass. We further study small-space approximation algorithms for ED, LCS, and longest increasing subsequence (LIS) in the non-streaming setting. Here, we design algorithms that achieve 1 ± ε approximation for all three problems, where ε > 0 can be any constant and even slightly sub-constant. Our algorithms only use poly-logarithmic space while maintaining a polynomial running time. This significantly improves previous results in terms of space complexity, where all known results need to use space at least Ω(√n). Our algorithms make novel use of triangle inequality and carefully designed recursions to save space, which can be of independent interest.
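For readers unfamiliar with the two quantities named in this abstract, the following is a minimal sketch of the textbook dynamic programs for ED and LCS. It is not the streaming or small-space algorithm from the paper: it needs random access to both strings and space linear in the second string. The function names `edit_distance` and `lcs_length` are illustrative, and ED is assumed here to count insertions, deletions, and substitutions.

```python
def edit_distance(x: str, y: str) -> int:
    """Textbook DP for edit distance (insertions, deletions, substitutions)."""
    # prev[j] holds the distance between the processed prefix of x and y[:j].
    prev = list(range(len(y) + 1))
    for i, cx in enumerate(x, start=1):
        curr = [i]  # distance from x[:i] to the empty prefix of y
        for j, cy in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,               # delete cx
                curr[j - 1] + 1,           # insert cy
                prev[j - 1] + (cx != cy),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def lcs_length(x: str, y: str) -> int:
    """Textbook DP for the length of a longest common subsequence."""
    prev = [0] * (len(y) + 1)
    for cx in x:
        curr = [0]
        for j, cy in enumerate(y, start=1):
            if cx == cy:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
    print(lcs_length("ABCBDAB", "BDCABA"))
```

Both routines keep only two rows of the DP table, so they use O(|y|) space and O(|x|·|y|) time, which gives a baseline against which the space bounds quoted in the abstract can be read.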

Cite as

Kuan Cheng, Alireza Farhadi, MohammadTaghi Hajiaghayi, Zhengzhong Jin, Xin Li, Aviad Rubinstein, Saeed Seddighin, and Yu Zheng. Streaming and Small Space Approximation Algorithms for Edit Distance and Longest Common Subsequence. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 198, pp. 54:1-54:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{cheng_et_al:LIPIcs.ICALP.2021.54,
  author =	{Cheng, Kuan and Farhadi, Alireza and Hajiaghayi, MohammadTaghi and Jin, Zhengzhong and Li, Xin and Rubinstein, Aviad and Seddighin, Saeed and Zheng, Yu},
  title =	{{Streaming and Small Space Approximation Algorithms for Edit Distance and Longest Common Subsequence}},
  booktitle =	{48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)},
  pages =	{54:1--54:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-195-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{198},
  editor =	{Bansal, Nikhil and Merelli, Emanuela and Worrell, James},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2021.54},
  URN =		{urn:nbn:de:0030-drops-141236},
  doi =		{10.4230/LIPIcs.ICALP.2021.54},
  annote =	{Keywords: Edit Distance, Longest Common Subsequence, Longest Increasing Subsequence, Space Efficient Algorithm, Approximation Algorithm}
}

Zheng, Yudi

Document
An Empirical Study on Deoptimization in the Graal Compiler

Authors: Yudi Zheng, Lubomír Bulej, and Walter Binder

Published in: LIPIcs, Volume 74, 31st European Conference on Object-Oriented Programming (ECOOP 2017)


Abstract
Managed language platforms such as the Java Virtual Machine or the Common Language Runtime rely on a dynamic compiler to achieve high performance. Besides making optimization decisions based on the actual program execution and the underlying hardware platform, a dynamic compiler is also in an ideal position to perform speculative optimizations. However, these tend to increase the compilation costs, because unsuccessful speculations trigger deoptimization and recompilation of the affected parts of the program, wasting previous work. Even though speculative optimizations are widely used, the costs of these optimizations in terms of extra compilation work have not been previously studied. In this paper, we analyze the behavior of the Graal dynamic compiler integrated in Oracle's HotSpot Virtual Machine. We focus on situations which cause program execution to switch from machine code to the interpreter, and compare application performance using three different deoptimization strategies which influence the amount of extra compilation work done by Graal. Using an adaptive deoptimization strategy, we managed to improve the average start-up performance of benchmarks from the DaCapo, ScalaBench, and Octane benchmark suites, mostly by avoiding wasted compilation work. On a single-core system, we observed an average speed-up of 6.4% for the DaCapo and ScalaBench workloads, and a speed-up of 5.1% for the Octane workloads; the improvement decreases with an increasing number of available CPU cores. We also find that the choice of a deoptimization strategy has negligible impact on steady-state performance. This indicates that the cost of speculation matters mainly during start-up, where it can disturb the delicate balance between executing the program and the compiler, but is quickly amortized in steady state.

Cite as

Yudi Zheng, Lubomír Bulej, and Walter Binder. An Empirical Study on Deoptimization in the Graal Compiler. In 31st European Conference on Object-Oriented Programming (ECOOP 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 74, pp. 30:1-30:30, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


@InProceedings{zheng_et_al:LIPIcs.ECOOP.2017.30,
  author =	{Zheng, Yudi and Bulej, Lubom{\'\i}r and Binder, Walter},
  title =	{{An Empirical Study on Deoptimization in the Graal Compiler}},
  booktitle =	{31st European Conference on Object-Oriented Programming (ECOOP 2017)},
  pages =	{30:1--30:30},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-035-4},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{74},
  editor =	{M\"{u}ller, Peter},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2017.30},
  URN =		{urn:nbn:de:0030-drops-72583},
  doi =		{10.4230/LIPIcs.ECOOP.2017.30},
  annote =	{Keywords: dynamic compiler, profile-guided optimization, deoptimization}
}

Zheng, Yufan

Document
Track A: Algorithms, Complexity and Games
On the Degree of Boolean Functions as Polynomials over ℤ_m

Authors: Xiaoming Sun, Yuan Sun, Jiaheng Wang, Kewen Wu, Zhiyu Xia, and Yufan Zheng

Published in: LIPIcs, Volume 168, 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)


Abstract
Polynomial representations of Boolean functions over various rings such as ℤ and ℤ_m have been studied since Minsky and Papert (1969). From then on, they have been employed in a large variety of areas including communication complexity, circuit complexity, learning theory, coding theory and so on. For any integer m ≥ 2, each Boolean function has a unique multilinear polynomial representation over the ring ℤ_m. The degree of this polynomial is called the modulo-m degree, denoted as deg_m(⋅). In this paper, we investigate lower bounds on the modulo-m degree of Boolean functions. When m = p^k (k ≥ 1) for some prime p, we give a tight lower bound deg_m(f) ≥ k(p-1) for any non-degenerate function f:{0,1}ⁿ → {0,1}, provided that n is sufficiently large. When m contains two different prime factors p and q, we give a nearly optimal lower bound for any symmetric function f:{0,1}ⁿ → {0,1}, namely deg_m(f) ≥ n/(2+1/(p-1)+1/(q-1)).
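To make the definition of the modulo-m degree concrete, here is a small brute-force sketch. It is not taken from the paper: it computes the integer coefficients of the unique multilinear polynomial by Möbius inversion over subsets, reduces them modulo m, and reports the largest monomial size with a nonzero coefficient. The name `modulo_m_degree` and the convention that the zero polynomial has degree 0 are assumptions made for illustration, and the running time is exponential in n.

```python
from itertools import combinations


def modulo_m_degree(f, n: int, m: int) -> int:
    """Degree of the multilinear polynomial of f: {0,1}^n -> {0,1} over Z_m.

    f takes a tuple of n bits. The coefficient of the monomial indexed by S
    is sum over T subset of S of (-1)^(|S|-|T|) * f(indicator of T), mod m.
    """
    degree = 0  # assumption: the zero polynomial is reported as degree 0
    for size in range(n + 1):
        for S in combinations(range(n), size):
            coeff = 0
            for t in range(size + 1):
                for T in combinations(S, t):
                    point = tuple(1 if i in T else 0 for i in range(n))
                    coeff += (-1) ** (size - t) * f(point)
            if coeff % m != 0:
                degree = size  # sizes are visited in increasing order
    return degree


def parity(bits):
    """XOR of all input bits."""
    return sum(bits) % 2


if __name__ == "__main__":
    for modulus in (2, 3, 4):
        print(modulus, modulo_m_degree(parity, 3, modulus))
```

For parity on three bits the integer polynomial is x1 + x2 + x3 - 2(x1x2 + x1x3 + x2x3) + 4x1x2x3, so the reported degree depends on which of the coefficients 2 and 4 vanish modulo m.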

Cite as

Xiaoming Sun, Yuan Sun, Jiaheng Wang, Kewen Wu, Zhiyu Xia, and Yufan Zheng. On the Degree of Boolean Functions as Polynomials over ℤ_m. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 100:1-100:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


@InProceedings{sun_et_al:LIPIcs.ICALP.2020.100,
  author =	{Sun, Xiaoming and Sun, Yuan and Wang, Jiaheng and Wu, Kewen and Xia, Zhiyu and Zheng, Yufan},
  title =	{{On the Degree of Boolean Functions as Polynomials over $\mathbb{Z}_m$}},
  booktitle =	{47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)},
  pages =	{100:1--100:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-138-2},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{168},
  editor =	{Czumaj, Artur and Dawar, Anuj and Merelli, Emanuela},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2020.100},
  URN =		{urn:nbn:de:0030-drops-125070},
  doi =		{10.4230/LIPIcs.ICALP.2020.100},
  annote =	{Keywords: Boolean function, polynomial, modular degree, Ramsey theory}
}

Zheng, Hongyu

Document
Exact Transcript Quantification Over Splice Graphs

Authors: Cong Ma, Hongyu Zheng, and Carl Kingsford

Published in: LIPIcs, Volume 172, 20th International Workshop on Algorithms in Bioinformatics (WABI 2020)


Abstract
The probability of sequencing a set of RNA-seq reads can be directly modeled using the abundances of splice junctions in splice graphs instead of the abundances of a list of transcripts. We call this model graph quantification, which was first proposed by Bernard et al. (2014). The model can be viewed as a generalization of transcript expression quantification where every full path in the splice graph is a possible transcript. However, the previous graph quantification model assumes the length of single-end reads or paired-end fragments is fixed. We provide an improvement of this model to handle variable-length reads or fragments and incorporate bias correction. We prove that our model is equivalent to running a transcript quantifier with exactly the set of all compatible transcripts. The key to our method is constructing an extension of the splice graph based on Aho-Corasick automata. The proof of equivalence is based on a novel reparameterization of the read generation model of a state-of-the-art transcript quantification method. This new approach is useful for modeling scenarios where the reference transcriptome is incomplete or not available and can be further used in transcriptome assembly or alternative splicing analysis.

Cite as

Cong Ma, Hongyu Zheng, and Carl Kingsford. Exact Transcript Quantification Over Splice Graphs. In 20th International Workshop on Algorithms in Bioinformatics (WABI 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 172, pp. 12:1-12:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


@InProceedings{ma_et_al:LIPIcs.WABI.2020.12,
  author =	{Ma, Cong and Zheng, Hongyu and Kingsford, Carl},
  title =	{{Exact Transcript Quantification Over Splice Graphs}},
  booktitle =	{20th International Workshop on Algorithms in Bioinformatics (WABI 2020)},
  pages =	{12:1--12:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-161-0},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{172},
  editor =	{Kingsford, Carl and Pisanti, Nadia},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2020.12},
  URN =		{urn:nbn:de:0030-drops-128013},
  doi =		{10.4230/LIPIcs.WABI.2020.12},
  annote =	{Keywords: RNA-seq, alternative splicing, transcript quantification, splice graph, network flow}
}